Report on the collation of Brassica A and C genome Glutamine Synthetase gene sequences
Abstract
Oilseed rape (OSR) has a significant environmental footprint, which is increasing in part because of the larger acreages now being grown as feedstock for biodiesel. This footprint is largely attributable to the greenhouse gas emissions associated with the current requirement for high nitrogen (N) fertiliser inputs, and could be reduced by increasing the nitrogen use efficiency (NUE) of OSR. In this study we have compiled available sequence data in order to assess the gene composition of OSR for a key N assimilatory gene, glutamine synthetase (GS). A range of evidence suggests that genetic variation in this gene could be used to produce improvements in NUE. We have found a total of 15 different loci present in the Brassica A and C genomes which represent 10 different homoeologous loci, i.e. loci located at equivalent positions in the A and C genomes, and conclude that this is probably an underestimate. Analysis of the genetic variation at each locus would be greatly facilitated by the availability of locus-specific assays. One of the main barriers to achieving this is the high degree of sequence conservation between the A and C genome homoeologues, and knowledge of the sequence of all of these is required to design the assays reliably. Current sequence data are limited mostly to the very highly conserved coding regions. We have started to address this by generating new genomic sequence data for 3 GS loci. Together, the compilation of the currently available sequence data and the new sequence data adds significantly to our understanding of GS loci in OSR and will assist in the genetic analysis of this gene family to search for alleles that may contribute to improvements in NUE.
Introduction
Nitrogen (N) is one of the essential elements that plants need to obtain from their environment and comprises a significant proportion of their biomass.
The results of Broadley et al (2004) show the average shoot organic N content to be about 4.3% of the dry weight and the total N content to be 6.1% across a phylogenetically diverse range of species. The report on the diversity analysis of shoot mineral accumulation in B. napus accompanying this report found the average shoot organic N content to be 3.3% of the dry weight, ranging from 1.7% to 5.1%. Leguminous plants can fix N2 directly from the atmosphere with the aid of Rhizobium bacteria harboured in specialised root nodules. Other plants obtain nitrogen from minerals in the soil, primarily as nitrate and ammonium. These nitrogen-containing ions are taken up by specialised transporters in the roots and assimilated into organic compounds as shown in Fig 1.
Figure 1 – Pathway for N assimilation in plants
Nitrate (NO3−) is the most oxidised mineral form of N and is reduced to ammonium (NH4+) by a two-step process, first to nitrite (NO2−) by nitrate reductase (NR) and then to ammonium by nitrite reductase (NiR). The next step is the reaction of ammonium with glutamate (Glu), catalysed by glutamine synthetase (GS; also known as glutamate ammonia ligase), to produce glutamine (Gln), the first organic N compound. In order to maintain a supply of glutamate to feed this reaction another crucial enzyme, glutamate oxoglutarate aminotransferase (GOGAT), catalyses the transamination of the amino group of Gln onto 2-oxoglutarate (2-OG) to form 2 molecules of Glu in a cycle known as the GOGAT cycle. Glu can then either be fed back again as a substrate for GS or transported and used as a source of N for the production of other nitrogenous compounds. The assimilation of N from soil minerals, however, is not the end of the assimilation story. As plants are sessile, it is important that they conserve the N that they have accumulated. Various processes within plants result in the conversion of organic N back to ammonium.
Ammonium exists in equilibrium with ammonia gas (NH4+ + OH− ⇌ NH3 + H2O). NH3 can permeate cell membranes and so is at risk of being lost from the plant by diffusion. In addition, NH3 is a reactive cytotoxic molecule, so only low levels can be tolerated. By far the most significant process leading to the production of ammonium, by a factor of more than 10 compared with the assimilation from soil N, is photorespiration, which is a reaction associated with photosynthesis (see Linka et al 2005). Other processes, such as the remobilisation of nitrogen as plants go through developmental transitions (including germination, flowering and seed production) and the recovery of N from senescing leaves, also result in significant production of ammonium which needs to be reassimilated.
Glutamine synthetase (GS)
It is generally thought that all mineral N is incorporated into organic N via the GS enzyme. GS therefore has a key central role in plant nitrogen metabolism. In fact GS is crucial for life and is recognised as one of the oldest classes of enzymes known, being possessed by all living organisms from bacteria to eukaryotes. The central role of GS is also highlighted by the fact that it is the target of the broad spectrum herbicide phosphinothricin (also known as glufosinate and BASTA), which acts as a potent inhibitor. Plant GS has been the subject of several recent reviews (Cren & Hirel, 1999; Miflin & Habash, 2002). Plant GSs are classified into two isoforms, referred to as GS1 and GS2, and all are encoded in the nuclear genome. GS1 is located in the cytoplasm and is encoded by a multigene family (5 copies in Arabidopsis) with different genes possessing distinct expression profiles. GS2 is usually referred to as the plastidic form, although the GS2 protein from Arabidopsis has been shown to be dual targeted to mitochondria as well (Taira et al, 2004).
In most plant species characterised GS2 is encoded by a single gene, although three copies have now been identified in hexaploid wheat (Bernard et al, 2008). The primary function of GS2 in leaves appears to be the reassimilation of ammonia produced during photorespiration, but it is also expressed in a range of other tissues. The different GS1 genes, on the other hand, are expressed in different tissues within the plant, including roots (Ishiyama et al 2004) and leaves, where specific genes may be upregulated during senescence (e.g. Buchanan-Wollaston & Ainsworth, 1997). In order to fulfil all the ammonium assimilation demands of a plant, the different members of the GS gene family are controlled by a sophisticated regulatory mechanism (Bernard et al, 2008) which involves expression changes in response to minerals and environmental signals such as light, expression in specific tissues and cell types, specific subcellular compartmentalisation and regulation at post-transcriptional and post-translational levels. In addition, in plants GS is an octameric enzyme, that is, eight GS protein subunits bind to form the active holoenzyme complex. This gives further scope for fine-tuning the regulation through the possibility of incorporating subunits from different genes in the same holoenzyme. Numerous studies have probed the importance of GS in nitrogen use efficiency (NUE) by testing different strategies of modulating its activity. The overexpression of either GS1 or GS2 has been performed in a range of plant species and the effects on ammonium accumulation and plant biomass assessed. Examples of these studies include Fuentas et al (2001), who overexpressed GS1 in tobacco and found no effect under optimum N supply conditions, but observed increased photosynthetic rates and plant biomass in plants starved of N.
Habash et al (2001) overexpressed GS1 in wheat leaves and found increases in root and grain dry matter, and Oliveira et al (2002) overexpressed GS1 in tobacco and observed a light-dependent increase in growth under high and low N supply conditions. Fei et al (2003) produced a range of transgenic pea plants with soybean GS overexpressed in different tissues. In one line with increased root GS levels an increase in root biomass was obtained when plants were grown at a range of N levels. However, a second line did not show this, and overexpression in nodules and leaves had no effect on GS activities and plant biomass. The results of other studies showed less effect. This may in part be explained by the complex regulation of GS levels, such that increased GS expression does not necessarily lead to an increase in GS activity. In an alternative approach, Husted et al (2002) produced oilseed rape GS2 antisense lines which had 50-75% lower GS2 levels. They found that no increase in ammonium accumulation occurred, indicating that a surplus of GS was maintained by the plants. However, in rice lines carrying homozygous retrotransposon-induced GS1 mutations and lacking detectable GS1 protein in the leaves, Tabuchi et al (2005) found severe retardation of growth and grain filling. The other endogenous wild-type GS genes also present in these mutants were unable to complement the effect of the mutations. This indicates that individual GS genes may perform specific essential roles within a plant. One observation from these studies is that a beneficial effect of increased GS activity may be more pronounced under N-limiting conditions than under a plentiful supply of N. Another approach which has provided further support for a possible central role for GS in NUE has been the identification of quantitative trait loci (QTL) for NUE-related traits that are linked to loci affecting GS activity.
In rice, Obara et al (2001) identified 7 QTLs for GS1 protein content and also mapped one GS gene. These genomic locations overlapped with QTLs for spikelet weight and leaf senescence, leading to the conclusion that GS1 in the leaf blade is an important factor for grain filling. Several studies in maize have mapped GS genes and multiple QTLs for GS activity and found correspondence between these and QTLs for N re-mobilisation, kernel weight and germination efficiency (Hirel et al, 2001; Limami et al, 2002; Gallais & Hirel 2004), leading to the conclusion that GS activity is also important for grain filling in maize. In the Arabidopsis Columbia x Landsberg erecta population, Rauh et al (2002) identified a number of growth-related QTLs in plants supplied with different nitrogen sources. GS was highlighted as a candidate gene underlying a QTL on chromosome 5. Most recently, Habash et al (2007) found positive correlations between QTLs for total leaf GS activity and grain and stem N in a bread wheat mapping population, although no correlations or negative correlations were obtained with grain yield components. While the genes responsible for these QTLs have not been demonstrated, taken together the results nevertheless suggest that GS has an important role in various aspects of plant growth and performance. In particular, Andrews et al (2004) conclude that for cereals the production of plants with increased levels of GS1 in senescing leaves could help reassimilation of the released N for the benefit of seed production.
GS gene structure
As previously stated, the model plant Arabidopsis possesses 6 GS genes: 1 plastidic and 5 cytoplasmic isoforms. Brassicas have evolved from a common ancestor with Arabidopsis via a hypothesised triploid intermediate that has subsequently evolved to form the present day diploid Brassica genomes. These still retain considerable regions of genomic duplication and triplication compared with Arabidopsis.
We thus anticipate that the A and C genomes individually are likely to possess a complex family of GS genes and that both sets will be combined in oilseed rape (B. napus). In Brassicas several GS genes have been sequenced already, most notably 6 from B. napus by Ochs et al (1999), together with a full-length genomic clone and an additional cDNA from B. rapa (Zhang et al 2006). Figure 2 shows the complex intron structure of these genes.
Figure 2. Gene structure of B. napus GS genes
The genomic sequence of the published B. napus plastidic/mitochondrial GS2 gene (accession number AJ271909) is indicated by a thick black line and the positions of the introns are given by green boxes (diagram approximately to scale). The GS2 gene contains 12 exons, with the first intron being located in the 5’ untranslated region. An N-terminal organellar targeting sequence is indicated by a blue box; this is cleaved off after import into the organelle. Published B. napus GS1 genes have a conserved sequence length that can be matched with introns 2-12 of GS2 and here are drawn with conserved intron positions.
In the present study, on the basis of its key role in N assimilation, we hypothesise that variation in GS activity is a component of the genetic variation in NUE in oilseed rape. One approach to investigating this is to examine GS sequence variation and look for correlations with component traits associated with NUE, or to identify plants possessing novel GS alleles that could be subjected to detailed phenotypic analysis. In order to look at GS sequence variation it is first necessary to establish the extent of the gene family in B. napus and determine the sequences of each family member. Sequence differences between the genes would permit the design of gene-specific PCR primers that would enable each locus to be selectively amplified and sequenced from different plants.
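The idea of exploiting sequence differences between family members for gene-specific primer design can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the function name `locus_specific_columns` and the toy alignment are invented for illustration, and the only assumption is that the sequences have already been aligned to equal length. The function lists the alignment columns at which one locus differs from every other locus, which are the candidate positions for anchoring a discriminating primer.

```python
from typing import Dict, List


def locus_specific_columns(alignment: Dict[str, str], target: str) -> List[int]:
    """Return 1-based alignment columns where `target` differs from all other sequences.

    Columns where the target itself has a gap ('-') or an unknown base ('N') are skipped.
    A gap in another sequence is treated as a difference from the target base.
    """
    target_seq = alignment[target]
    others = [seq for name, seq in alignment.items() if name != target]
    if any(len(seq) != len(target_seq) for seq in others):
        raise ValueError("all aligned sequences must have the same length")

    columns = []
    for i, base in enumerate(target_seq):
        if base in "-N":
            continue
        if all(other[i] != base for other in others):
            columns.append(i + 1)
    return columns


# Toy alignment with hypothetical locus names and sequences (not real GS data).
toy_alignment = {
    "locus_A": "ATGGCTCAGATCTT",
    "locus_B": "ATGGCTCAAATCTT",
    "locus_C": "ATGGCACAAATCTT",
}

for locus in toy_alignment:
    print(locus, locus_specific_columns(toy_alignment, locus))
```

In a highly conserved family such as GS, few columns are expected to qualify within coding regions, which is consistent with the emphasis placed here on obtaining intron and untranslated-region sequence.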
Materials and methods
Database searches and sequence compilation
Online sequence databases were searched using the NCBI BLAST service at http://www.ncbi.nlm.nih.gov/blast/Blast.cgi for Brassica napus (AC genome), B. oleracea (C genome) and B. rapa (A genome) sequences homologous to published GS sequences (Ochs et al, 1999). Searches encompassed all appropriate mRNA and genomic sequence categories in the database. Retrieved sequences, together with the sequences from the opposite ends of clones where available, were catalogued and compared with each other. Sequence assembly and analysis used the Lasergene DNASTAR suite of software. Many of the sequences derived from high throughput screening approaches, such as the GSS and EST sequences, required individual identification and removal of contaminating flanking vector sequences that had escaped the automated trimming procedures employed prior to database submission. Comparative alignments of the sequences enabled them to be grouped into distinct gene classes and then assembled into contigs where they were judged to correspond to identical loci from the same genotype. A number of single nucleotide polymorphisms (SNPs) were apparent when compiling EST and GSS sequences into contigs; where a SNP occurred in only one sequence it was usually considered to be a sequencing error. Where the same SNPs were present in more than one sequence, this was used as evidence for dividing the sequences into separate contigs. However, this process was sometimes a judgement call and errors in compilation may have been made. The sequence contigs were saved as new sequence files. B. napus sequences were compared to the B. oleracea and B. rapa sequences to determine the most likely genome to which they corresponded. To perform sequence alignments and determine the positions of intron/exon boundaries, all the sequences were aligned to the B.
napus cv Drakkar full-length GS2 genomic clone (accession number AJ271909, here renamed as BnaA.GLN2.a), which was used as a reference framework for locating the positions of introns and exons. This alignment was subsequently used to guide extraction of cDNA sequences, which were compiled into a separate alignment. These were in turn used to generate protein sequences, from which a further alignment was produced.
PCR primer design, amplification, cloning and sequencing
Preliminary versions of the final sequence alignment presented below were used in attempts to design primers that would discriminate between the different loci. A range of primers were tested, but under the PCR conditions used most either gave no product or were clearly non-specific. The two primer pairs that gave single bands of the expected size on a gel, corresponding to GS sequences, are:
GS2 primers:
BO-PL-GS12-5f – 5’ TCCCAGGGATCCAAGCTTTAAAC 3’ (located in intron 1 of GLN2 at position 2042 to 2064 in the sequence alignment)
BO-PL-GS12-3r – 5’ CATTTATTATCCCGGCAAGACTACTTAG 3’ (located in the 3’UTR of GLN2 genes at position 4955 to 4982 in the sequence alignment)
GS1 primers:
R1PCR5 (Ochs et al 1999) – 5’ ACCTTCTTGTCATTTTCTC 3’ (located in intron 1 at position 2213 to 2231 in the sequence alignment)
BOCYGSR123R – 5’ AAACACAAACTTGAAGCCCCAGAG 3’ (located in intron 1 at position 4898 to 4921 in the sequence alignment)
GS1 internal sequencing primer:
CYINTR – 5’ GATTGATGCTCCACGGTTTG 3’ (located in exon 11 at position 4607 to 4626 in the sequence alignment)
Preparation of genomic DNA templates.
Genomic DNA was prepared from the following plant material using the Qiagen DNeasy 96 Plant Kit:
1. B. oleracea – A12DHd and GDDH33 (parent lines of the AGDH mapping population); CA25 and AC498 (parent lines of the NGDH mapping population)
2. B. rapa – Chiifu-401-42 and Kenshin-402-43 (parent lines of the CKDH mapping population)
3. B.
napus – Tapidor DH and Ningyou 7 (parent lines of the TNDH mapping population)
The genomic DNA preps were then amplified using the GenomiPhi™ whole genome amplification kit.
PCR amplification, cloning and sequencing.
10 μl PCR reactions were set up containing 10 pmol of each primer, 10 ng GenomiPhi-amplified genomic DNA template, and 1 unit Hot Star Taq polymerase in the supplied buffer (Qiagen). Reactions were amplified in an ABI 9700 PCR machine using the following cycling conditions: 1 cycle of 95°C for 15 min followed by 35 cycles of 30 s at 94°C, 3 min at 55°C and 30 s at 67°C. The products of the reactions were evaluated on an agarose gel. Products from reactions containing single bands were subcloned into the pMOSBlue vector using a Blunt-ended PCR Cloning Kit (GE Healthcare). Subclones containing inserts of the expected size were first amplified with Templify (Amersham), sequenced with M13 forward and reverse primers using the ABI Big Dye sequencing kit, and run on an ABI3130XL Genetic Analyser. Further sequencing reactions were performed with an internal sequencing primer for some clones, and contiguous sequences of the clones were assembled. Where sequences have yet to be completed using internal primers, N residues were inserted between the forward and reverse sequence reads to correspond to the predicted exonic regions. The newly generated sequences were added to the complete alignment of all the deduced Brassica sequences.
Results and discussion
A summary of all the sequence accessions and the genes they correspond to can be found here. The alignment of all the genomic and cDNA sequences is given in Figure 3.
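The primer coordinates quoted in the methods can be cross-checked with a short Python sketch. The primer sequences and alignment positions are copied from the text; the helper names `span_matches` and `gc_percent` are introduced here for illustration only. The sketch tests whether each primer's length equals the number of alignment columns it is stated to occupy and reports its GC content, a property commonly inspected when judging primers.

```python
# Primer name -> (sequence 5'->3', first alignment column, last alignment column),
# as quoted in the methods section.
PRIMERS = {
    "BO-PL-GS12-5f": ("TCCCAGGGATCCAAGCTTTAAAC", 2042, 2064),
    "BO-PL-GS12-3r": ("CATTTATTATCCCGGCAAGACTACTTAG", 4955, 4982),
    "R1PCR5": ("ACCTTCTTGTCATTTTCTC", 2213, 2231),
    "BOCYGSR123R": ("AAACACAAACTTGAAGCCCCAGAG", 4898, 4921),
    "CYINTR": ("GATTGATGCTCCACGGTTTG", 4607, 4626),
}


def span_matches(seq: str, start: int, end: int) -> bool:
    """True if the primer length equals the inclusive column span start..end."""
    return len(seq) == end - start + 1


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    gc = sum(1 for base in seq.upper() if base in "GC")
    return 100.0 * gc / len(seq)


for name, (seq, start, end) in PRIMERS.items():
    status = "consistent" if span_matches(seq, start, end) else "MISMATCH"
    print(f"{name}: length {len(seq)}, columns {start}-{end} ({status}), "
          f"GC {gc_percent(seq):.1f}%")
```

A length that matches its column span only shows that the quoted coordinates are internally consistent; it says nothing about how specific the primer is for a given locus.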
The positions of the 12 exons of GS2 relative to the alignment in Figure 3 are:
Exon 1 – approx 1491 to 1592 (there is an intron in the 5’UTR)
Exon 2 – 2060 (but differs between genes) to 2314
Translation start site of GS2 – 2070
Exon 3 – 2506 to 2545
Exon 4 – 2671 to 2774
Exon 5 – 2868 to 2916
Exon 6 – 3004 to 3110
Exon 7 – 3450 to 3537
Exon 8 – 3637 to 3765
Exon 9 – 3906 to 3980
Exon 10 – 4073 to 4503
Exon 11 – 4600 to 4660
Exon 12 – 4765 to 5083 (polyadenylation site for some mRNAs)
Stop codon – 4902 to 4904
For the GS1 genes the translation start is at 2241 and the stop codon is at 4859 to 4861.
For GS1 genes there is no intron in the 5’UTR, so there are only 11 exons. The GS1 exon numbering is therefore one less than for GS2 genes, i.e. exon 1 of GS1 corresponds to exon 2 of GS2.
BolGSR1-2 has an extra intron in exon 10 at 4258 to 4353.
BolGSR5-1 has an extra intron in exon 10 at 4127 to 4219.
The alignment of all the deduced open reading frames is given in Figure 4.
The alignment of all the protein coding sequences, including those for the Arabidopsis GS genes, is given in Figure 5.
Genes that are evolutionarily corresponding partners in the A and C genomes, i.e. that are located in collinear genomic regions that have arisen following speciation of B. rapa and B. oleracea from a common ancestor, but after the evolution of the replicated paralogous loci, are referred to as homoeologous. The relationships between the different homoeologous loci are summarised in Table 1.
Table 1. Summary of the homoeologous relationships between the GS genes
Homoeologies are indicated by genes on the same line
Columns: B. oleracea C genome | B. napus C genome | B. napus undetermined genome | B. napus A genome | B. rapa A genome
BolGLN2_1 BnaGLN2_2 BnaGLN2_1 BraGLN2_1
BolGLN2_2 BnaGLN2_3 BnaGLN2_4 BraGLN2_2
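The GS2 exon coordinates listed above lend themselves to a simple calculation of exon and intron spans, sketched below in Python. The coordinates are taken from the text (exon 1 is given there only approximately); the function names are introduced for illustration. Because the positions refer to columns of a multiple alignment, which may include gap columns, the spans should be read as upper bounds on the nucleotide lengths in any one gene rather than as exact sizes.

```python
# (GS2 exon number, first alignment column, last alignment column), from the text.
GS2_EXONS = [
    (1, 1491, 1592),   # approximate in the source
    (2, 2060, 2314),   # start differs between genes
    (3, 2506, 2545),
    (4, 2671, 2774),
    (5, 2868, 2916),
    (6, 3004, 3110),
    (7, 3450, 3537),
    (8, 3637, 3765),
    (9, 3906, 3980),
    (10, 4073, 4503),
    (11, 4600, 4660),
    (12, 4765, 5083),
]


def span(start: int, end: int) -> int:
    """Number of alignment columns in the inclusive range start..end."""
    return end - start + 1


def intron_spans(exons):
    """Map intron number n (the intron following exon n) to its span in columns."""
    result = {}
    for (number, _, end), (_, next_start, _) in zip(exons, exons[1:]):
        result[number] = next_start - end - 1
    return result


def gs1_exon_number(gs2_exon: int) -> int:
    """Convert a GS2 exon number to GS1 numbering (GS1 lacks GS2 exon 1)."""
    if gs2_exon < 2:
        raise ValueError("GS2 exon 1 has no GS1 counterpart")
    return gs2_exon - 1


for number, start, end in GS2_EXONS:
    print(f"exon {number}: {span(start, end)} columns")
for number, length in intron_spans(GS2_EXONS).items():
    print(f"intron {number}: {length} columns")
```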